    Enhancing Security Patch Identification by Capturing Structures in Commits

    With the rapidly increasing number of open source software (OSS) projects, the majority of software vulnerabilities in open source components are fixed silently, which leaves deployed software that integrates them unable to get timely updates. Hence, it is critical to design a security patch identification system to ensure the security of the software in use. However, most existing works on security patch identification treat the changed code and the commit message of a commit as a flat sequence of tokens and learn its semantics with simple neural networks, ignoring structure information. To address these limitations, in this paper, we propose E-SPI, which extracts the structure information hidden in a commit for effective identification. Specifically, it consists of a code change encoder, which extracts the syntactic structure of the changed code and uses a BiLSTM to learn the code representation, and a message encoder, which constructs the dependency graph of the commit message and uses a graph neural network (GNN) to learn the message representation. We further enhance the code change encoder by embedding contextual information related to the changed code. To demonstrate the effectiveness of our approach, we conduct extensive experiments against six state-of-the-art approaches on an existing dataset and on data from a real deployment environment. The experimental results confirm that our approach significantly outperforms current state-of-the-art baselines.

    EasyNet: An Easy Network for 3D Industrial Anomaly Detection

    3D anomaly detection is an emerging and vital computer vision task in industrial manufacturing (IM). Recently, many advanced algorithms have been published, but most of them cannot meet the needs of IM. They have several disadvantages: i) they are difficult to deploy on production lines because they rely heavily on large pre-trained models; ii) they greatly increase storage overhead due to overuse of memory banks; iii) their inference cannot run in real time. To overcome these issues, we propose an easy and deployment-friendly network (called EasyNet) that uses neither pre-trained models nor memory banks: firstly, we design a multi-scale multi-modality feature encoder-decoder to accurately reconstruct the segmentation maps of anomalous regions and encourage the interaction between RGB images and depth images; secondly, we adopt a multi-modality anomaly segmentation network to achieve a precise anomaly map; thirdly, we propose an attention-based information entropy fusion module for feature fusion during inference, making it suitable for real-time deployment. Extensive experiments show that EasyNet achieves an anomaly detection AUROC of 92.6% without using pre-trained models or memory banks. In addition, EasyNet is faster than existing methods, with a high frame rate of 94.55 FPS on a Tesla V100 GPU.

    TransRepair: Context-aware Program Repair for Compilation Errors

    Automatically fixing compilation errors can greatly raise the productivity of software development by guiding novice or AI programmers as they write and debug code. Recently, learning-based program repair has gained extensive attention and become the state of the art in practice, but it still leaves plenty of room for improvement. In this paper, we propose an end-to-end solution, TransRepair, to locate the error lines and create the correct substitute for a C program simultaneously. Unlike its counterparts, our approach takes into account both the context of the erroneous code and the diagnostic compilation feedback. We then devise a Transformer-based neural network to learn how to repair from the erroneous code, its context, and the diagnostic feedback. To increase the effectiveness of TransRepair, we summarize 5 types and 74 fine-grained sub-types of compilation errors from two real-world program datasets and the Internet. A program corruption technique is then developed to synthesize a large dataset of 1,821,275 erroneous C programs. Through extensive experiments, we demonstrate that TransRepair outperforms the state of the art in both single repair accuracy and full repair accuracy. Further analysis sheds light on the strengths and weaknesses of contemporary solutions for future improvement. Comment: 11 pages, accepted to ASE '2
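
    The program corruption idea mentioned above can be illustrated with a minimal sketch. The abstract does not describe the actual corruption rules, so the single rule used here (deleting one statement-ending semicolon) and the names `drop_semicolon` and `PROGRAM` are assumptions made for illustration only.

```python
import random


def drop_semicolon(source, rng):
    """Delete one statement-ending semicolon from a C source string.

    Returns (corrupted_source, line_number, original_line), where
    line_number is 1-based. The triple is a hypothetical training
    example: erroneous program, error location, correct substitute.
    """
    lines = source.split("\n")
    # Candidate lines are those whose last non-blank character is ';'.
    candidates = [i for i, line in enumerate(lines) if line.rstrip().endswith(";")]
    if not candidates:
        raise ValueError("source has no line ending in a semicolon")
    i = rng.choice(candidates)
    original_line = lines[i]
    # Strip trailing whitespace, then remove the final ';'.
    lines[i] = original_line.rstrip()[:-1]
    return "\n".join(lines), i + 1, original_line


# Made-up input program, used only for illustration.
PROGRAM = "\n".join([
    "#include <stdio.h>",
    "int main(void) {",
    "    int x = 1;",
    '    printf("%d\\n", x);',
    "    return 0;",
    "}",
])

corrupted, line_no, fix = drop_semicolon(PROGRAM, random.Random(0))
print(line_no, repr(fix))
```

    Seeding the random generator keeps the synthesized example reproducible; a fuller generator would presumably apply many rule types rather than this one.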

    Real3D-AD: A Dataset of Point Cloud Anomaly Detection

    High-precision point cloud anomaly detection is the gold standard for identifying defects in advanced machining and precision manufacturing. Despite some methodological advances in this area, the scarcity of datasets and the lack of a systematic benchmark hinder its development. We introduce Real3D-AD, a challenging high-precision point cloud anomaly detection dataset that addresses these limitations. With 1,254 high-resolution 3D items, each containing from forty thousand to millions of points, Real3D-AD is the largest dataset for high-precision 3D industrial anomaly detection to date. Real3D-AD surpasses existing 3D anomaly detection datasets in point cloud resolution (0.0010mm-0.0015mm), 360 degree coverage and perfect prototype. Additionally, we present a comprehensive benchmark for Real3D-AD, revealing the absence of baseline methods for high-precision point cloud anomaly detection. To address this, we propose Reg3D-AD, a registration-based 3D anomaly detection method incorporating a novel feature memory bank that preserves local and global representations. Extensive experiments on the Real3D-AD dataset highlight the effectiveness of Reg3D-AD. For reproducibility and accessibility, we provide the Real3D-AD dataset, benchmark source code, and Reg3D-AD on our website: https://github.com/M-3LAB/Real3D-AD
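
    As a rough illustration of how a feature memory bank can be used for anomaly scoring, the sketch below scores each test feature by its distance to the nearest stored feature. The abstract does not say how Reg3D-AD computes scores, so nearest-neighbour scoring, the function name `anomaly_scores`, and the toy features are assumptions; the registration step is omitted entirely.

```python
import math


def anomaly_scores(memory_bank, test_features):
    """Score each test feature by its Euclidean distance to the nearest
    feature in the memory bank (larger distance = more anomalous)."""
    if not memory_bank:
        raise ValueError("memory bank must not be empty")
    return [min(math.dist(f, m) for m in memory_bank) for f in test_features]


# Toy 3-D features standing in for descriptors of defect-free samples.
bank = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
scores = anomaly_scores(bank, [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)])
print(scores)
```

    A feature that matches something in the bank scores zero, while a feature far from every stored one scores high.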

    Unblind Your Apps: Predicting Natural-Language Labels for Mobile GUI Components by Deep Learning

    According to the World Health Organization (WHO), an estimated 1.3 billion people worldwide live with some form of vision impairment, of whom 36 million are blind. Because of their disability, including this minority in society is a challenging problem. The recent rise of smartphones provides a new solution by giving blind users convenient access to information and services for understanding the world. Users with vision impairment can use the screen reader embedded in the mobile operating system to read the content of each screen within an app, and use gestures to interact with the phone. However, the prerequisite for using screen readers is that developers add natural-language labels to image-based components when developing the app. Unfortunately, more than 77% of apps have missing-label issues, according to our analysis of 10,408 Android apps. Most of these issues are caused by developers' lack of awareness and knowledge in considering this minority. And even if developers want to add labels to UI components, they may not come up with concise and clear descriptions, as most of them have no vision impairment themselves. To overcome these challenges, we develop a deep-learning based model, called LabelDroid, to automatically predict the labels of image-based buttons by learning from large-scale commercial apps in Google Play. The experimental results show that our model can make accurate predictions and that the generated labels are of higher quality than those from real Android developers. Comment: Accepted to 42nd International Conference on Software Engineerin

    Trends and Patterns of Disparities in Burden of Lung Cancer in the United States, 1974-2015

    Background: Although lung cancer incidence and mortality have been declining since the 1990s, the extent to which such progress has been made is unequal across population segments. Updated epidemiologic data on trends and patterns of disparities are lacking. Methods: Data on lung cancer cases and deaths during 1974 to 2015 were extracted from the Surveillance, Epidemiology, and End Results program. Age-standardized lung cancer incidence and mortality and their annual percent changes were calculated by histologic types, demographic variables, and tumor characteristics. Results: Lung cancer incidence decreased since 1990 (1990 to 2007: annual percent change, −0.9 [95% CI, −1.0%, −0.8%]; 2007 to 2015: −2.6 [−2.9%, −2.2%]). Among adults aged between 20 and 39 years, a higher incidence was observed among females during 1995 to 2011, after which a faster decline in female lung cancer incidence (males: −2.5% [−2.8%, −2.2%]; females: −3.1% [−4.7%, −1.5%]) resulted in a lower incidence among females. The white population had a higher incidence than the Black population for small cell carcinoma since 1987. Black females were the only group whose adenocarcinoma incidence plateaued since 2012 (−5.0% [−13.0%, 3.7%]). A higher incidence for squamous cell carcinoma was observed among Black males and females than among white males and females during 1974 to 2015. After circa 2005, octogenarians and older patients constituted the group with the highest lung cancer incidence. Incidence for localized and AJCC/TNM stage I lung cancer among octogenarians and older patients plateaued since 2009, while mortality continued to rise (localized: 1.4% [0.6%, 2.1%]; stage I: 6.7% [4.5%, 9.0%]). Conclusions: Lung cancer disparities prevail across population segments. Our findings inform effective approaches to eliminate lung cancer disparities by targeting at-risk populations.
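
    The annual percent change (APC) reported above is commonly defined from a log-linear fit: if ln(rate) = a + b * year, then APC = 100 * (exp(b) - 1). The abstract does not state how the authors fitted their trends, so the ordinary least-squares fit below is an assumption, and the rates are made-up numbers rather than SEER data.

```python
import math


def annual_percent_change(years, rates):
    """Estimate APC from a least-squares fit of ln(rate) against year."""
    n = len(years)
    if n < 2 or n != len(rates):
        raise ValueError("need at least two (year, rate) pairs of equal length")
    logs = [math.log(r) for r in rates]
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx  # b in ln(rate) = a + b * year
    return 100.0 * (math.exp(slope) - 1.0)


# Synthetic rates that fall by exactly 2% per year.
years = list(range(2010, 2016))
rates = [60.0 * 0.98 ** k for k in range(len(years))]
apc = annual_percent_change(years, rates)
print(round(apc, 3))
```

    Because the synthetic series declines by a constant 2% per year, the fitted APC comes out at approximately -2.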

    Discrete element modeling of the machining processes of brittle materials: recent development and future prospective


    From market to device: adaptive and efficient malware detection for Android

    In the past few years, the market share of the Android system has risen to a leading position. With that large user base, the number of Android applications on Google Play had increased to 3 million by 2018. However, not all applications in the market are free from security risks. API misuse and incorrect invocation by developers may cause significant data leakage or tangibly degrade user experience. Meanwhile, due to the complexity of the Android system and the diversity of real usage scenarios, it is quite a challenge to solve all these problems in a straightforward way. Thus, we aim to provide solutions to Android security problems for different usage scenarios separately. A precise representation of attacks can benefit malware detection in both accuracy and efficiency. However, describing attacks precisely on the Android platform is still far from meeting expectations. In addition, new features on Android, such as communication mechanisms, introduce new challenges and difficulties for attack detection. To solve these problems from the perspective of service providers and security researchers, we propose abstract attack models to precisely capture the semantics of various Android attacks, which include the corresponding targets, the involved behaviors, and their execution dependencies. Meanwhile, we construct a novel graph-based model called ICCG (Inter-component Communication Graph) to describe the internal control flows and inter-component communications of applications. The models take into account more communication channels while maximally preserving program logic. With the guidance of the attack models, we propose a static searching approach to detect attacks hidden in the ICCG. To reduce the false positive rate, we introduce an additional dynamic confirmation step to check whether the detected attacks are false alarms. 
Experiments show that our integrated malware detection system, DroidEcho, can detect attacks in both benchmark and real-world applications effectively and efficiently, with a precision of 89.5%. However, whereas the official market (i.e., Google Play Store) can adopt a heavy and complicated detection approach (e.g., DroidEcho) for the applications it provides, apps from unofficial markets and third-party resources continue to pose serious security threats to end users. Meanwhile, downloading an app first and then uploading it to a server for detection is time-consuming, because network transmission has a lot of overhead. In addition, the uploading process itself is exposed to attackers. Consequently, a last line of defense on mobile devices is much needed. To address this problem, we propose an effective Android malware detection system, MobiTive, which leverages customized deep neural networks to provide a real-time and responsive detection environment on mobile devices. MobiTive is a pre-installed solution rather than an app scanning and monitoring engine used after installation, which makes it more practical and secure. Although a deep learning-based approach can be maintained efficiently on the server side for malware detection, original deep learning models cannot be directly deployed and executed on mobile devices due to various performance limitations, such as computation power, memory size, and energy. 
Therefore, we evaluate and investigate the following key points: (1) the performance of different feature extraction methods based on source code or binary code; (2) the performance of different feature type selections for deep learning on mobile devices; (3) the detection accuracy of different deep neural networks on mobile devices; (4) the real-time detection performance and accuracy on different mobile devices; and (5) the potential based on the evolution trend of mobile devices' specifications. Finally, we propose a practical solution (MobiTive) to detect Android malware on mobile devices. Based on the evaluations and findings on MobiTive, we find that syntax features, such as permissions and API calls, lack the semantics that can represent potential malicious behaviors and that would yield a more robust model with high accuracy for malware detection. We therefore propose an efficient Android malware detection system, named SeqMobile, which adopts behavior-based sequence features and leverages customized deep neural networks on mobile devices instead of on the server end. Unlike traditional sequence-based approaches on the server end, SeqMobile adopts three performance optimization methods to reduce the time of feature extraction and prediction and thus meet the performance demands of mobile devices. To evaluate the effectiveness and efficiency of our system, we conduct experiments on the following aspects: 1) the detection accuracy of different recurrent neural networks (RNNs); 2) the feature extraction performance on different mobile devices; and 3) the detection accuracy and prediction time cost of different sequence lengths. The results show that SeqMobile can effectively detect malware with high accuracy. Moreover, our performance optimization methods improve the performance of training and prediction by at least twofold. 
Additionally, to discover the potential performance optimization offered by the state-of-the-art TensorFlow model optimization toolkit for our sequence-based approach, we also provide an evaluation of the toolkit, which can serve as guidance for other systems that rely on sequence-based learning approaches. Overall, we conclude that our sequence-based approach, together with our performance optimization methods, enables us to efficiently detect malware under the performance demands of mobile devices. Doctor of Philosoph

    Region-by-Region Registration Combining Feature-Based and Optical Flow Methods for Remote Sensing Images

    Although geometric registration has been studied in the remote sensing community for many decades, methods that successfully register images while allowing for the local inconsistent deformation caused by topographic relief are rare. Toward this end, a region-by-region registration combining feature-based and optical flow methods is proposed. The proposed framework is built on the calculation of pixel-wise displacements and the mosaicking of displacement fields. Concretely, the initial displacement fields for a pair of images are calculated by the block-weighted projective model in flat-terrain regions and by Brox optical flow estimation in complex-terrain regions. The abnormal displacements resulting from the sensitivity of optical flow to land use or land cover changes are adaptively detected and corrected by the weighted Taylor expansion. Subsequently, the displacement fields are mosaicked seamlessly for subsequent steps. Experimental results show that the proposed method outperforms comparative algorithms, achieving the highest registration accuracy qualitatively and quantitatively.
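
    The region-by-region idea can be sketched as a per-pixel selection between two displacement fields. This is only a simplified illustration: the function name `mosaic_displacements`, the boolean terrain mask, and the toy (dx, dy) values are assumptions, and the seamless blending and abnormal-displacement correction described in the abstract are not modelled.

```python
def mosaic_displacements(complex_mask, flat_field, complex_field):
    """Combine two displacement fields pixel by pixel.

    complex_mask[r][c] is True where the terrain is treated as complex.
    Each field holds one (dx, dy) tuple per pixel. Complex-terrain pixels
    take the optical-flow displacement; all others take the displacement
    from the feature-based model.
    """
    if not (len(complex_mask) == len(flat_field) == len(complex_field)):
        raise ValueError("all inputs must have the same number of rows")
    mosaic = []
    for mask_row, flat_row, complex_row in zip(complex_mask, flat_field, complex_field):
        if not (len(mask_row) == len(flat_row) == len(complex_row)):
            raise ValueError("all inputs must have the same row length")
        mosaic.append([c if m else f for m, f, c in zip(mask_row, flat_row, complex_row)])
    return mosaic


# Toy 2x2 example with made-up displacements.
mask = [[False, True], [False, False]]
flat = [[(1.0, 0.0), (1.0, 0.0)], [(1.0, 0.0), (1.0, 0.0)]]
flow = [[(0.5, 0.2), (0.4, 0.3)], [(0.5, 0.1), (0.6, 0.2)]]
result = mosaic_displacements(mask, flat, flow)
print(result)
```

    A hard per-pixel switch like this would leave seams at region borders, which is presumably why the method mosaics the fields seamlessly instead.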